A Dataset for ICD-10 Coding of Death Certificates: Creation and Usage

Authors

  • Thomas Lavergne
  • Aurélie Névéol
  • Aude Robert
  • Cyril Grouin
  • Grégoire Rey
  • Pierre Zweigenbaum
Abstract

Very few datasets have been released for the evaluation of diagnosis coding with the International Classification of Diseases, and only one so far in a language other than English. This paper describes a large-scale dataset prepared from French death certificates, and the problems which needed to be solved to turn it into a dataset suitable for the application of machine learning and natural language processing methods of ICD-10 coding. The dataset includes the free-text statements written by medical doctors, the associated meta-data, the human coder-assigned codes for each statement, as well as the statement segments which supported the coder’s decision for each code. The dataset comprises 93,694 death certificates totalling 276,103 statements and 377,677 ICD-10 code assignments (3,457 unique codes). It was made available for an international automated coding shared task, which attracted five participating teams. An extended version of the dataset will be used in a new edition of the shared task.
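The layout described in the abstract (a statement, its metadata, the assigned codes, and the supporting segment for each code) can be sketched as a small record type. This is only an illustration: the class and field names below are hypothetical and are not taken from the released dataset. The totals are the ones quoted in the abstract, and the helper simply derives per-certificate and per-statement averages from them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CodeAssignment:
    """One coder-assigned ICD-10 code (hypothetical field names)."""
    icd10_code: str          # code chosen by the human coder
    supporting_segment: str  # part of the statement that justified the code


@dataclass
class Statement:
    """One free-text statement from a death certificate (hypothetical layout)."""
    certificate_id: str
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)
    assignments: List[CodeAssignment] = field(default_factory=list)


# Totals quoted in the abstract.
CERTIFICATES = 93_694
STATEMENTS = 276_103
ASSIGNMENTS = 377_677
UNIQUE_CODES = 3_457


def density() -> Dict[str, float]:
    """Derive average densities from the totals quoted in the abstract."""
    return {
        "statements_per_certificate": STATEMENTS / CERTIFICATES,
        "codes_per_statement": ASSIGNMENTS / STATEMENTS,
        "codes_per_certificate": ASSIGNMENTS / CERTIFICATES,
    }


if __name__ == "__main__":
    for name, value in density().items():
        print(f"{name}: {value:.2f}")
```

On these totals a certificate carries roughly three statements and about four code assignments on average, so a single statement often maps to more than one code.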

Similar resources

The Impact of Death Certificate Completion Errors on Underlying Cause of Death Coding at Shahid Mohammadi Hospital, Bandar Abbas

Introduction: Death information plays a critical role in the adjustment of health plans, and cause-of-death coding is what organizes this information. The purpose of this study was to review the impact of errors in the completion of death certificates on underlying cause of death coding at Shahid Mohammadi Hospital in Bandar Abbas. Methods: This descriptive cross-sectional study...

Differences among official statistics of mortality rates in Iran

Dear Editor, Death indicators and causes of death are both closely associated with socio-cultural, economic, and structural factors and determinants of health, all of which are at the core of the planning, monitoring, and assessment of intervention programs in any healthcare system (1). In Iran, the diagnosis and official registration of deaths are carried out by two independent organiz...

A Descriptive Study of Death Certificates and Burial Permits According to the Criteria of the World Health Organization and the Ministry of Health and Medical Education: A Brief Report

Background: The death certificate is a document containing the deceased individual’s basic information and identification, which is filled out, registered and signed by a doctor. The World Health Organization’s policies in their health planning, provide a suitable database with knowledge of the required elements for planners and other authorized information demanders. During a multi-year coop...

Deaths: Leading Causes for 2011-2012

Introduction: This report presents 2011-2012 data on leading causes of death in the Islamic Republic of Iran by province. Methods: Data in this report are based on information from all death certificates filed in the 31 provinces in 2011-2012. Causes of death classified by the International Classification of Diseases, Tenth Revision (ICD-10) are ranked a...

Fatal anaphylaxis registries data support changes in the WHO anaphylaxis mortality coding rules

Anaphylaxis is defined as a severe life-threatening generalized or systemic hypersensitivity reaction. The difficulty of coding anaphylaxis fatalities under the World Health Organization (WHO) International Classification of Diseases (ICD) system is recognized as an important reason for under-notification of anaphylaxis deaths. On current death certificates, a limited number of ICD codes are va...

Publication date: 2016